Artificial Intelligence for Vision Systems: Opportunities and Possibilities
By Eng. (Dr.) G. M. R. I. Godaliyadda
Computer Vision: What and Why?
As most of us already know, the primary task of Artificial Intelligence (AI) is to make machines attain human-like characteristics. This has led to a surge of interest in the field in the recent past. Tasks such as making a machine "see", "hear and speak", and "mimic movements" in a "humanlike" manner can be viewed as the main categories of subtasks handled by engineers in this field. Our attempts to perform these subtasks have resulted in the emergence of multiple sub-areas of research and development within AI, as illustrated in figure 1.
While all these areas are of tremendous interest to us as Engineers, one area, aptly named "Computer Vision," which deals with the interpretation and understanding of the visual world, stands out to me as an area of great opportunity for Engineering practice, especially in the light of recent developments. I wish to draw attention specifically to this aspect of AI because it is often overlooked by Sri Lankans, while there is tremendous interest in it in the global community. It is one of the most heavily funded areas for R&D in the developed world.
Figure 1: A Task based Classification of AI
The horrendous "Easter attacks" of last year and the ongoing global pandemic have realigned the priorities of the world and our nation. Many of the new engineering tasks or problems that have emerged as a result can be solved using computer vision and AI-based techniques. These new challenges highlight in many ways the need for a multidisciplinary approach to solving practical engineering problems. As a direct consequence, there is a clear demand for interpreting visual information in smart surveillance and remote monitoring applications, and these events have indirectly given rise to many other vision applications.
The Key Tasks of Computer Vision AI Systems
Some of the tasks performed by Computer Vision provide an insight into why this area is of interest to us as Engineers in the current environment, while also helping us understand how this field can assist us in solving a multitude of problems in multiple Engineering fields. At its core, Computer Vision uses visual inputs to develop models and algorithms that can accurately:
- Classify: Classification attempts to label whole photographs or video stills by broadly categorizing them according to labels or classes that are already given. The same action can be performed within a single image, where the classification is applied to the different objects identified inside it. Either way, it is supervised learning, as the class labels are provided prior to the learning phase.
- Cluster: Clustering is a general task performed by many AI systems, including Computer Vision based algorithms, which attempt to group or cluster whole images, or objects within an image, when no labels are provided. Hence it is called unsupervised learning, as the algorithm itself has to make sense of the visual data without labels from the user. As groups or clusters form, those that are isolated and stand alone form the "out of place" clusters that might be of interest for security or surveillance purposes.
Figure 2: Image Classification and Clustering as performed by a Computer Vision AI: Credits Western Digital
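The difference between the two tasks can be illustrated with a minimal Python sketch. It is not taken from any particular vision system: it assumes each image has already been reduced to a small feature vector (the two features and the labels "sky" and "building" are hypothetical), and it swaps in two simple textbook techniques, a nearest-centroid classifier for the supervised case and a naive k-means loop for the unsupervised case.

```python
from math import dist


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit_nearest_centroid(samples, labels):
    """Supervised: labels are given, so learn one centroid per class."""
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}


def classify(model, x):
    """Assign x to the class whose centroid is closest."""
    return min(model, key=lambda y: dist(model[y], x))


def kmeans(samples, k, iterations=10):
    """Unsupervised: no labels are given; group samples around k centres."""
    centres = list(samples[:k])  # naive initialisation: the first k samples
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda j: dist(centres[j], x))
            groups[nearest].append(x)
        # Keep the old centre if a group happens to be empty.
        centres = [centroid(g) if g else centres[j] for j, g in enumerate(groups)]
    return groups


# Hypothetical features per image: (mean brightness, edge density).
images = [(0.9, 0.1), (0.2, 0.8), (0.8, 0.2), (0.1, 0.9)]
labels = ["sky", "building", "sky", "building"]

model = fit_nearest_centroid(images, labels)
print(classify(model, (0.7, 0.3)))  # sky
print(kmeans(images, k=2))          # two groups found without using labels
```

The classifier can only name a class because the labels were supplied beforehand, whereas the clustering step returns unnamed groups that a user would still have to interpret.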
Application of Computer Vision to Real World Applications
All complex tasks performed by Computer Vision algorithms rely on those two principal actions, "clustering and classification", and build upon them based on context. What is fascinating about such Vision algorithms is their ability to work with other areas to interpret the scene (image or video still) and hence perform human-like judgement based on visual inputs such as images or video frames. For example:
- Vision can be used to identify objects in an image. This can be used in surveillance applications to first identify a "person of interest" then track this person's movement through available CCTV footage.
- Vision systems can identify and classify. This is useful for applications such as categorizing people into those who are wearing masks and those who are not, which enables individual violations of regulations to be detected automatically.
- The ability of Vision systems to identify the interactions and spacing of individuals enables them to detect the proximity of humans and their level of interaction from video surveillance footage. This will enable automated detection of Social Distancing violations.
Figure 3: Smart Surveillance through Computer Vision: Credits IndustryWired
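The proximity check behind the Social Distancing example can be sketched in a few lines of Python. This is only the final step of such a system, under stated assumptions: a separate detector has already located each person in a frame and mapped them to ground-plane coordinates in metres, and the function name and the 1.0 m threshold are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist


def distancing_violations(positions, min_distance_m=1.0):
    """Return index pairs of people standing closer than min_distance_m.

    positions: list of (x, y) ground-plane coordinates in metres, assumed
    to come from an upstream person detector (not shown here).
    """
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(positions), 2)
        if dist(p, q) < min_distance_m
    ]


# Three hypothetical people detected in one video frame.
people = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]
print(distancing_violations(people))  # [(0, 1)]
```

Comparing every pair is adequate for the handful of people in a typical frame; a crowded scene would call for a spatial index instead.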
These are just a few of the numerous applications of Computer Vision. Others include cloud motion prediction for micro-level weather forecasting and solar PV input prediction, use of satellite images for harvest estimation of agricultural produce, mineral deposit localization via hyperspectral imaging, coastal erosion estimation via remote monitoring, food and textile quality estimation, partial discharge monitoring in high voltage systems, and autonomous navigation systems. As can be seen, the applications of vision systems are wide ranging and span all areas of engineering.
The need for capitalization
As the world attempts to adjust to a new normal, there is opportunity for new industry. Those who seize the moment are the ones who capitalize on change. For example, the video conferencing and online learning systems development industries have capitalized on the changes that have taken place due to the ongoing pandemic. The organizations that lead the charge in such innovations stand to gain the most.
Many of these Computer Vision systems can be developed with minimal hardware resources while providing situational solutions to current issues, which gives us a unique opportunity. The fact that these algorithms attempt to make sense of the visual world in an automated manner makes remote monitoring and management far more convenient and economical.
This article attempts to explain the numerous opportunities that lie in the emerging area of computer vision-based AI in a simple and succinct manner, while hopefully encouraging all of us to seize the moment.
Eng. (Dr.) G. M. R. I. Godaliyadda
Senior Lecturer,
Department of Electrical and Electronic Engineering,
Faculty of Engineering,
University of Peradeniya.